Fix tokenizer loading for GPT2 #757
Merged
Fixes #752
Fix the issue with loading the tokenizer for 'gpt2'.
- `scrapegraphai/utils/tokenizer.py`: use `GPT2TokenizerFast` in the `num_tokens_calculus` function, importing `GPT2TokenizerFast` from `transformers`.
- `scrapegraphai/utils/tokenizers/tokenizer_ollama.py`: update the `num_tokens_ollama` function to handle `GPT2TokenizerFast`.
- `tests/graphs/smart_scraper_ollama_test.py`: update for `GPT2TokenizerFast`.

For more details, open the Copilot Workspace session.
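
As a rough illustration of the kind of token counting this change touches, the sketch below counts tokens with `GPT2TokenizerFast` from `transformers`. The helper names `count_tokens` and `load_gpt2_tokenizer` are hypothetical and are not the project's `num_tokens_calculus` or `num_tokens_ollama`. Loading `"gpt2"` assumes `transformers` is installed and that the tokenizer files can be downloaded or are already cached.

```python
from typing import List, Protocol


class Encoder(Protocol):
    """Anything exposing encode(text) -> list of token ids (assumed interface)."""

    def encode(self, text: str) -> List[int]:
        ...


def count_tokens(text: str, tokenizer: Encoder) -> int:
    """Return the number of tokens the given tokenizer produces for text."""
    return len(tokenizer.encode(text))


def load_gpt2_tokenizer():
    """Load the GPT-2 fast tokenizer (hypothetical helper).

    The import is deferred so that count_tokens stays usable with any
    other tokenizer even when transformers is not installed.
    """
    from transformers import GPT2TokenizerFast

    return GPT2TokenizerFast.from_pretrained("gpt2")
```

With this shape, `count_tokens("Hello world", load_gpt2_tokenizer())` would give the GPT-2 token count for the string. Passing the tokenizer in as an argument keeps the counting logic testable without network access.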